Abstract

We describe a new variational lower-bound 
on the minimum energy configuration of a 
planar binary Markov Random Field (MRF). 
Our method is based on adding auxiliary 
nodes to every face of a planar embedding 
of the graph in order to capture the effect of 
unary potentials. A ground state of the re- 
sulting approximation can be computed effi- 
ciently by reduction to minimum-weight per- 
fect matching. We show that optimization 
of variational parameters achieves the same 
lower-bound as dual-decomposition into the 
set of all cycles of the original graph. We 
demonstrate that our variational optimiza- 
tion converges quickly and provides high- 
quality solutions to hard combinatorial problems
10-100x faster than competing algorithms that
optimize the same bound.



1 Introduction 

Dual-decomposition methods for optimization have 
emerged as an extremely powerful tool for solving 
combinatorial problems in graphical models. These 
techniques can be thought of as decomposing a com- 
plex model into a collection of easier-to-solve compo- 
nents, providing a variational bound which can then 
be optimized over its parameters. A wide variety of 
algorithms have been proposed, often distinguished 
by the class of models from which subproblems are
constructed, including trees (Wainwright et al. 2005;
Kolmogorov 2006), planar graphs (Globerson and
Jaakkola 2007), outer-planar graphs (Batra et al.
2010), k-fans (Kappes et al. 2010), or some more
heterogeneous mix of combinatorial subproblems (e.g.,
Torresani et al. 2008).



While the class of tree-reweighted methods is now
fairly well understood, many of the same concepts and
guidance available for trees are not available for more
general classes of decompositions. In this paper, we 
analyze reweighting methods that seek to decompose 
binary MRFs into subproblems consisting of tractable 
planar subgraphs. We show that the ultimate build- 
ing blocks of such a decomposition are simple cycles 
of the original graph and that to achieve the tightest 
possible bounds, one must choose a set of subproblems 
that cover all such cycles. Cycles in planar-reweighted 
decomposition thus play a role analogous to trees in 
tree-reweighted decompositions. 

There are various techniques for enforcing consistency 
over cycles in an MRF. For example, one can tri- 
angulate the graph and introduce constraints over 
all triplets in the resulting triangulation. However, 
this involves $O(n^3)$ constraints, which is impractical
in large-scale inference problems. A more efficient 
route is to only add a small number of constraints as 
needed, e.g., using a cutting-plane approach (Sontag
and Jaakkola 2007).



The contribution of this paper is a graphical construc- 
tion for a new variational bound that enforces the con- 
straints over all cycles in a planar binary MRF with 
only a constant factor overhead. This representation is 
very simple and efficient to optimize, which we demon- 
strate in experimental comparisons to existing state- 
of-the-art, cycle-enforcing methods where we achieve 
substantial performance gains. 

2 Exact Inference for Binary 
Outer-planar MRFs 

Consider the energy function $E(X)$ associated with a general binary MRF defined over a collection of variables $(X_1, X_2, \ldots) \in \{0, 1\}^N$ with specified unary and pairwise potentials. It is straightforward to show that any such MRF can be reparametrized up to a constant using pairwise disagreement costs $\theta_{ij}$ along with unary parameters $\theta_i$ (see, e.g., Kolmogorov and Zabih 2004; Schraudolph and Kamenetsky 2008). The energy function can thus be written as

$$E(X, \theta) = \sum_{i>j} \theta_{ij} [X_i \neq X_j] + \sum_i \theta_i [X_i \neq 0] \tag{1}$$

where $[\cdot]$ is the indicator function and we have dropped any constant terms.^1

Figure 1: (a) shows a standard planar MRF which is represented by an energy function containing unary and pairwise potentials. (b) shows an equivalent MRF in which the unary terms have been replaced by an auxiliary node (square). Both (a) and (b) are intractable in general. (c) shows a decomposition which gives a lower-bound on the ground-state of (a) by using a collection of outer-planar graphs whose ground states can be computed efficiently using minimum-weight perfect matching. (d) shows the new lower-bound construction introduced in this paper which uses multiple auxiliary nodes, one for each face of the original graph.

We can express such an energy function without including any unary terms by introducing an auxiliary variable $X_0$ and replacing the unary terms with pairwise connections to $X_0$ so that

$$E_1(X, \theta) = \sum_{i>j} \theta_{ij} [X_i \neq X_j] + \sum_i \theta_i [X_i \neq X_0] \tag{2}$$

If we fix $X_0 = 0$, then $E_1$ is clearly equivalent to our original energy function $E$. Since the potentials in $E_1$ are symmetric, for any state $X = (X_0, X_1, \ldots)$, there is a state $\bar{X}$ with identical energy, given by flipping the states of every $X_i$ including $X_0$. Thus any $X$ that minimizes $E_1$ can be easily mapped to a minimizer of $E$, by flipping all states if $X_0 = 1$ and then discarding $X_0$.
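The equivalence between $E$ and $E_1$ can be checked by brute force on a tiny model. The following is a minimal sketch, assuming a hypothetical three-variable chain with illustrative weights; the function names `energy` and `energy_aux` are not from the paper.

```python
from itertools import product

def energy(x, theta_pair, theta_unary):
    """E(X): pairwise disagreement costs plus unary costs paid when X_i != 0."""
    pair = sum(w for (i, j), w in theta_pair.items() if x[i] != x[j])
    unary = sum(w for i, w in theta_unary.items() if x[i] != 0)
    return pair + unary

def energy_aux(x0, x, theta_pair, theta_unary):
    """E_1(X): each unary term becomes a disagreement cost with the auxiliary X_0."""
    pair = sum(w for (i, j), w in theta_pair.items() if x[i] != x[j])
    aux = sum(w for i, w in theta_unary.items() if x[i] != x0)
    return pair + aux

# Hypothetical chain 0 - 1 - 2; all weights are illustrative assumptions.
theta_pair = {(0, 1): -1.0, (1, 2): 0.5}
theta_unary = {0: 0.3, 1: -0.2, 2: 0.7}

for x in product((0, 1), repeat=3):
    flipped = tuple(1 - v for v in x)
    # Fixing X_0 = 0 recovers the original energy.
    assert energy_aux(0, x, theta_pair, theta_unary) == energy(x, theta_pair, theta_unary)
    # Flipping every variable, including X_0, leaves E_1 unchanged.
    assert energy_aux(1, flipped, theta_pair, theta_unary) == energy_aux(0, x, theta_pair, theta_unary)
```

The second assertion is the symmetry that lets a minimizer of $E_1$ with $X_0 = 1$ be flipped into a minimizer of $E$.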

Minimizing the energy function $E_1$ can be interpreted as the problem of finding a bi-partition of a graph $G_1$ which has a vertex $i$ corresponding to each variable $X_i$ and edges for any pair $(i, j)$ with $\theta_{ij} \neq 0$. The cost of a partition is simply the sum of the weights $\theta_{ij}$ of edges cut. Given a minimal weight partition, we can find a corresponding optimal state $X$ by assigning all the nodes in the partition containing $X_0$ to state 0 and the complement to state 1. Since the edge weights $\theta_{ij}$ may be negative, such a minimal weight cut is typically non-empty.
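The bi-partition view can be illustrated with an exhaustive search on a very small graph. This sketch only demonstrates the mapping from a cut to a state; it is not the matching-based algorithm discussed in the text, and the node names and weights are hypothetical.

```python
from itertools import product

def cut_weight(side, edges):
    """Sum of the weights of edges whose endpoints lie on different sides."""
    return sum(w for (u, v), w in edges.items() if side[u] != side[v])

def min_weight_bipartition(nodes, edges):
    """Brute-force minimum-weight bi-partition (only viable for tiny graphs)."""
    best = None
    for bits in product((0, 1), repeat=len(nodes)):
        side = dict(zip(nodes, bits))
        w = cut_weight(side, edges)
        if best is None or w < best[0]:
            best = (w, side)
    return best

def partition_to_state(side, aux="x0"):
    """State 0 for nodes on the side containing the auxiliary node, 1 otherwise."""
    return {v: int(s != side[aux]) for v, s in side.items() if v != aux}

# Hypothetical chain a - b - c with unary terms expressed as edges to x0.
nodes = ["x0", "a", "b", "c"]
edges = {("a", "b"): -1.0, ("b", "c"): 0.5,
         ("x0", "a"): 0.3, ("x0", "b"): -0.4, ("x0", "c"): 0.7}

weight, side = min_weight_bipartition(nodes, edges)
state = partition_to_state(side)
```

Because some weights are negative, the cheapest cut here is non-empty: it separates `b` from the rest of the graph.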

While minimizing $E(X, \theta)$ is computationally intractable in general (Barahona 1982), a clever construction due to Kasteleyn (1961, 1967) and Fisher (1961, 1966) allows one to find minimizing states when the graph corresponding to $E_1$ is planar. This is based on the complementary relation between states of the nodes $X$ and perfect matchings in the so-called expanded dual of the graph $G_1$. A minimizing state for a planar problem can thus be found efficiently, e.g., using Edmonds' blossom algorithm (Edmonds 1965) to compute minimum-weight perfect matchings.^2 We use the Blossom V implementation of Kolmogorov (2009) which is quite efficient in practice, easily handling problems with a million nodes in a few seconds. Furthermore, for planar problems, one can also compute the partition function associated with $E$ in polynomial time. See the report of Schraudolph and Kamenetsky (2008) for an in-depth discussion and implementation details.

^1 We assume in the rest of this paper that all MRFs are parameterized in this manner. In particular an MRF without unary parameters is one in which all the pairwise terms are symmetric.

While this reduction to perfect matching provides a unique tool for energy minimization and probabilistic inference, the requirement that $G_1$ be planar is a serious restriction. In particular, even if the original graph $G$ corresponding to $E$ is planar, e.g., in the case of the grid graphs commonly used in computer vision applications, $G_1$ is typically not, since the addition of edges from every node to the auxiliary node $X_0$ renders the graph non-planar. Assuming arbitrary values of $\theta_i$, those energy functions $E$ to which this method can be applied are exactly the set whose graphs $G$ are outer-planar. An outer-planar graph is a graph with a planar embedding where all vertices share a common face (e.g., the exterior face). For such a graph, every vertex can be connected to a single auxiliary node placed inside the common face without any edges crossing so that the resulting graph $G_1$ is still planar. See examples in Figure 1.

^2 Matchings in planar graphs can be found somewhat more efficiently than for general graphs which yields the best known worst-case running time of $O(N^{3/2} \log N)$ for max-cut in planar graphs (Shih et al. 1990).



3 Inference with Dual Decomposition 

Dual decomposition is a general approach for leveraging such islands of tractability in order to perform inference in more general MRFs. The application of dual decomposition to inference in graphical models was popularized by the work of Wainwright et al. (2003, 2005) on Tree-Reweighted Belief Propagation (TRW). TRW finds an optimal decomposition of an MRF into a collection of tree-structured problems where exact inference is tractable. More formally, let $t$ index a collection of subproblems defined over the same set of variables $X$ and whose parameters sum up to the original parameter values, so that $\theta = \sum_t \theta^t$. The energy function is linear in $\theta$ so we have

$$E_{MAP} = \min_X E(X, \theta) = \min_X \sum_t E(X, \theta^t) \tag{3}$$

$$\geq \max_{\{\theta^t\} : \sum_t \theta^t = \theta} \; \sum_t \min_{X^t} E(X^t, \theta^t) \tag{4}$$

The inequality arises because each subproblem $t$ is solved independently and thus may yield different solutions. On the other hand, if the solutions to the subproblems all happen to agree then the bound is tight. The problem of maximizing the lower-bound over possible decompositions $\{\theta^t\}$ is convex and when inference for each sub-problem is tractable (for example, $\theta^t$ is tree-structured) the bound can be optimized efficiently using message passing (fixed-point iterations) based on computing min-marginals in each subproblem (Wainwright et al. 2003) or by projected subgradient methods (Komodakis et al. 2007).

A powerful tool for understanding the minimization in Equation 4 is to work with the Lagrangian dual. Equation 3 is an integer linear program over $X$, but the integrality constraints can be relaxed to a linear program over continuous parameters $\mu$ representing min-marginals which are constrained to lie within the marginal polytope, $\mu \in M(G)$. The set of constraints that define $M(G)$ are a function of the graph structure $G$ and are defined by an (exponentially large) set of linear constraints that restrict $\mu$ to the set of min-marginals achievable by some consistent joint distribution (see Wainwright and Jordan 2008). Lower-bounds of the form in Equation 4 correspond to relaxing this set of constraints to the intersection of the constraints enforced by the structure of each subproblem. For the tree-structured subproblems of TRW, this relaxation results in the so-called local polytope $L(G)$ which enforces marginalization constraints on each edge. Since $L(G)$ is an outer bound on $M(G)$, minimization yields a lower-bound on the original problem. For any relaxed set of constraints, the values of $\mu$ may not correspond to the min-marginals of any valid distribution, and so are referred to as pseudo-marginals.

One can tighten the bound in Equation 4 by adding additional subproblems to the primal (or equivalently constraints to the dual) which enforce consistency over larger sets of variables. This has been explored, e.g., by Sontag and Jaakkola (2007) who suggest adding cycle inequalities to the dual which enforce consistency of pseudo-marginals around a cycle. Since there are a large number of potential cycles present in the graph, Sontag suggests either using a cutting plane algorithm to successively add violated cycle constraints (Sontag and Jaakkola 2007) or to only add small cycles such as triplets or quadruplets (Sontag et al. 2008) that can be enumerated with relative ease and optimized using local message passing rather than general LP solvers.

For binary problems, it is natural to consider replacing Wainwright's tree subproblems with tractable outer-planar subgraphs.^3 This has been explored by Globerson and Jaakkola (2007) and Batra et al. (2010) who proposed decomposing a graph into a set of planar graphs for the purposes of estimating the partition function^4 and minimum energy state respectively. For energy minimization, it is well-known that any set of subproblems that cover every edge is sufficient to achieve the TRW bound; but what is the best set of planar graphs to use? Is it necessary to use all outer-planar or even all planar subgraphs? It turns out that the set of all outer-planar or planar subgraphs is equivalent to the set of all cycle constraints in $G$, which can be enforced by any so-called cycle basis of the graph. This observation leads to algorithms such as reweighted perfect matching (Schraudolph 2010), which explicitly constructs a set of subproblems that form a complete cycle basis, or incremental algorithms to enforce cycle constraints (Sontag and Jaakkola 2007; Sontag et al. 2008; Komodakis and Paragios 2008).

^3 Note that outer-planar graphs have treewidth two and hence the minimum energy solution can also be found efficiently using the standard junction tree algorithm. However, the reduction to matching is still of interest for general planar graphs without unary potentials, which have a treewidth of $O(\sqrt{N})$.

^4 More precisely, Globerson and Jaakkola (2007) consider the inclusion of any binary, planar subgraph of $G_1$. This may include subgraphs with treewidth greater than two.

In the following sections, we focus on the case in which the original MRF is planar but the addition of the auxiliary unary node makes it non-planar. We describe a novel, compactly expressed variational approximation. We then prove that it achieves as tight a bound as decomposition into any collection of cycles or outer-planar graphs. This also gives a relatively simple proof that the tightest bounds achievable by sets of planar, outer-planar, or cycle subproblems are equivalent, and that the set of subproblems that are necessary and sufficient to achieve this bound form a cycle basis, i.e., cover every chordless cycle in the original graph at least once.
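The gap between a tree decomposition and the true minimum energy is easiest to see on a frustrated cycle. The sketch below is an illustration under assumed weights: a triangle in which every edge prefers disagreement, split into its three spanning trees with each edge weight shared equally between the two trees that contain it. This is one feasible decomposition, not necessarily the optimal one.

```python
from itertools import product

def energy(x, theta_pair):
    """Pairwise-only energy: sum of weights over disagreeing edges."""
    return sum(w for (i, j), w in theta_pair.items() if x[i] != x[j])

def min_energy(theta_pair, n):
    """Exhaustive minimization over all 2^n binary states."""
    return min(energy(x, theta_pair) for x in product((0, 1), repeat=n))

# Frustrated triangle: each edge rewards disagreement, but at most
# two of the three edges can be cut by any binary labeling.
theta = {(0, 1): -1.0, (1, 2): -1.0, (0, 2): -1.0}
e_map = min_energy(theta, 3)

# One spanning tree per omitted edge; every edge appears in two trees,
# so halving the weights makes the subproblem parameters sum to theta.
trees = [{e: w / 2 for e, w in theta.items() if e != omitted} for omitted in theta]
lower_bound = sum(min_energy(t, 3) for t in trees)

assert lower_bound <= e_map
```

Each tree can cut both of its edges, so the subproblems disagree with one another and the bound (-3.0) falls strictly below the true minimum (-2.0). A cycle subproblem would close this gap.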

4 Planar Cycle Coverings 

Consider a planar embedding of the graph $G$ corresponding to an MRF. Since we cannot directly connect the unary node $X_0$ to every node in the graph without losing planarity, we propose the following relaxation. For each face $f$ of $G$ add an independent copy $X_0^f$ of the unary node $X_0$ and connect it to all vertices on the boundary of the face with weights $\theta_i^f$. Let $N_i$ be the set of unary node copies attached to node $i$. We split the original unary potential $\theta_i$ across all the unary face nodes connected to $i$ while maintaining the constraint that $\sum_{f \in N_i} \theta_i^f = \theta_i$ (see Figure 1(d)). Using this system we have the following relaxation

$$E_{MAP} = \min_{X,\; X_0^f = X_0^g \;\forall f, g} \; \sum_{i>j} \theta_{ij} [X_i \neq X_j] + \sum_i \sum_{f \in N_i} \theta_i^f [X_i \neq X_0^f]$$

$$\geq \min_{X, \{X_0^f\}} \; \sum_{i>j} \theta_{ij} [X_i \neq X_j] + \sum_i \sum_{f \in N_i} \theta_i^f [X_i \neq X_0^f] \tag{5}$$

The inequality arises because we have dropped the constraint that all copies of $X_0$ take on the same value. On the other hand, since the graph corresponding to the relaxation in Equation 5 is planar, we can compute the minimum exactly. Furthermore, we have freedom to adjust the $\theta_i^f$ parameters so long as they sum up to our original parameters. This yields the variational problem

$$E_{PCC} = \max_{\{\theta_i^f\} : \sum_{f \in N_i} \theta_i^f = \theta_i} \; \min_{X, \{X_0^f\}} \; \sum_{i>j} \theta_{ij} [X_i \neq X_j] + \sum_i \sum_{f \in N_i} \theta_i^f [X_i \neq X_0^f] \tag{6}$$

where $E_{MAP} \geq E_{PCC}$. We refer to this construction as a planar cycle covering of the original graph since the unary potentials for each face cycle are covered by some auxiliary node (and as we shall see, all other cycles also are covered in a precise sense). Although this planar decomposition includes duplicate copies of nodes from the original problem, it differs in that there are not multiple independent subproblems but just a single, larger planar problem to be solved. This is in some ways analogous to the work of Yarkony et al. (2010) which replaces the collection of spanning trees in TRW with a single "covering tree".
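The relaxation can be made concrete on the smallest interesting case, a single 4-cycle, whose planar embedding has two faces that both touch every vertex. The sketch below evaluates the relaxed energy by exhaustive search rather than by perfect matching; the weights, the even split of the unary terms, and all function names are illustrative assumptions.

```python
from itertools import product

# Hypothetical 4-cycle with unary terms; all weights are illustrative.
edges = {(0, 1): -1.0, (1, 2): 0.8, (2, 3): -0.6, (0, 3): 0.4}
theta_unary = [0.5, -0.3, 0.2, -0.7]
faces = ("inner", "outer")  # both faces of the cycle touch every vertex

def pcc_energy(x, x0, split):
    """Energy of the relaxed graph; x0[f] is the state of the copy on face f."""
    pair = sum(w for (i, j), w in edges.items() if x[i] != x[j])
    aux = sum(split[i][f] for i in range(len(x)) for f in faces if x[i] != x0[f])
    return pair + aux

def relaxed_min(split, tie_copies=False):
    """Minimum over X and the face copies; tie_copies forces all copies to agree."""
    best = float("inf")
    for x in product((0, 1), repeat=4):
        for c in product((0, 1), repeat=len(faces)):
            if tie_copies and len(set(c)) > 1:
                continue
            best = min(best, pcc_energy(x, dict(zip(faces, c)), split))
    return best

# Split each unary parameter evenly across the faces touching the node.
even_split = [{f: t / len(faces) for f in faces} for t in theta_unary]
e_map = relaxed_min(even_split, tie_copies=True)   # copies constrained to agree
e_relaxed = relaxed_min(even_split)                # copies free to disagree
assert e_relaxed <= e_map
```

Dropping the agreement constraint only enlarges the set being minimized over, which is why the relaxed value can never exceed the constrained one.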



As with dual decomposition, the parameters may be optimized using subgradient or marginal fixed-point updates. For example, the subgradient updates for $\theta_i^f$ at a given setting of $X$ can be easily computed by taking a gradient and enforcing the summation constraint. This yields the update rule

$$\theta_i^f \leftarrow \theta_i^f + \lambda \left( [X_i \neq X_0^f] - \frac{1}{|N_i|} \sum_{g \in N_i} [X_i \neq X_0^g] \right) \tag{7}$$

where $|N_i|$ is the number of auxiliary face nodes attached to $X_i$ and $\lambda$ is a stepsize parameter. After each such gradient step, one must recompute the optimal setting of $X$ which can be done efficiently using perfect matching.

The subgradient update lends itself to a simple interpretation. If $X_0^f$ disagrees with $X_i$ but the other neighboring copies $\{X_0^g\}$ do not, then the cost for $X_0^f$ and $X_i$ disagreeing is increased. On the other hand, if all the copies $\{X_0^g\}$ take on the same state then the update leaves the parameters unchanged.
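The projected subgradient update for a single node can be written in a few lines. This is a minimal sketch: the function name, the dictionary layout keyed by face labels, and the example values are assumptions made for illustration.

```python
def subgradient_step(theta_f, x_i, x0, step):
    """Projected subgradient update for the split unary parameters of one node.

    theta_f: dict mapping face label -> current share of the unary parameter
    x_i:     current state (0 or 1) of the node
    x0:      dict mapping face label -> state of the auxiliary copy on that face
    step:    step size (lambda in the text)
    """
    disagree = {f: float(x_i != x0[f]) for f in theta_f}
    mean = sum(disagree.values()) / len(theta_f)
    # Subtracting the mean keeps the shares summing to the original parameter.
    return {f: theta_f[f] + step * (disagree[f] - mean) for f in theta_f}

# Hypothetical node attached to three face copies labelled f, g and h.
theta_f = {"f": 0.25, "g": 0.25, "h": 0.5}
updated = subgradient_step(theta_f, x_i=0, x0={"f": 1, "g": 0, "h": 0}, step=0.3)
unchanged = subgradient_step(theta_f, x_i=1, x0={"f": 0, "g": 0, "h": 0}, step=0.3)
```

In the first call only the copy on face `f` disagrees with the node, so its share grows while the others shrink by the same total amount. In the second call all copies agree with each other, so the parameters are returned unchanged.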

5 Cycle Decompositions and Cycle 
Covering Bounds 

In this section, we show that the planar cycle cover bound $E_{PCC}$ for any planar binary MRF $G$ is equivalent to the lower-bound given by decomposition into the collection of all cycles of $G$.

For a given planar binary MRF with graph $G$, consider the bound $E_{CYCLE}$ given by decomposing the MRF into the collection of all cycles of $G$. By optimizing the allocation of parameters across these subproblems one produces a lower-bound that is generally tighter than that given by TRW and related algorithms since the subproblems can correctly account for the energy of frustrated cycles that is approximated in the tree-based bound. In fact, for planar graphs without unary potentials adding cycle subproblems is enough to make the lower-bound tight.

Lemma 5.1 The lower-bound $E_{CYCLE}$ given by the optimal cycle decomposition of a planar MRF with no unary potentials is tight.

For such an MRF the set of states corresponds exactly with the set of edge incidence vectors representing cuts in the graph. The convex hull of this set is known as the cut polytope. The connection between the cut polytope and the cycle decomposition is seen by taking the Lagrangian dual of the lower-bound optimization which yields a constrained optimization of the edge incidence vectors (pseudo-marginals) over a polytope defined by cycle inequalities. For planar graphs (or more generally graphs containing no $K_5$ minor), the set of cycle inequalities is sufficient to completely describe the cut polytope. See Barahona and Mahjoub (1986) for proof and related discussion by Sontag and Jaakkola (2007). Just as local edge consistency implies global consistency for a tree, cycle consistency implies global consistency for a planar binary MRF without unary potentials.

Figure 2: Demonstration that the minimal energy of a cycle is equal to the maximum lower-bound given by an approximation in which unary potentials are represented by a decoupled set of auxiliary variables (squares). At optimality of the variational parameters, all six cuts depicted must have equal energies and thus it is possible to choose a ground-state in which all the duplicate copies of the auxiliary node are in the same state.
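The cycle inequalities can be checked directly for a single cycle. The sketch below assumes their standard form for the cut polytope, which the text does not spell out: for every odd-sized subset $F$ of the cycle's edges, $\sum_{e \notin F} \mu_e + \sum_{e \in F} (1 - \mu_e) \geq 1$, where $\mu_e$ is the pseudo-marginal probability that edge $e$ is cut. The function name is hypothetical, and the enumeration is exponential in the cycle length, so it is only meant for small cycles.

```python
from itertools import combinations

def violated_cycle_inequalities(mu, tol=1e-9):
    """Return the odd edge subsets F whose cycle inequality is violated.

    mu: list of edge-disagreement pseudo-marginals in [0, 1], one per edge
        of a single cycle, listed in order around the cycle.
    """
    n = len(mu)
    violated = []
    for k in range(1, n + 1, 2):  # odd subset sizes only
        for F in combinations(range(n), k):
            lhs = sum((1 - mu[e]) if e in F else mu[e] for e in range(n))
            if lhs < 1 - tol:
                violated.append(F)
    return violated

# Cutting all three edges of a triangle is impossible for any labeling.
print(violated_cycle_inequalities([1, 1, 1]))
# Cutting exactly two edges corresponds to a valid cut.
print(violated_cycle_inequalities([1, 1, 0]))
```

The first vector is locally consistent on every edge but violates the inequality with $F$ equal to the whole triangle, which is exactly the kind of inconsistency a tree-based relaxation cannot detect.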

While the number of simple cycles grows exponentially in the size of the graph for general planar graphs, it is still possible to solve such a problem in polynomial time. It is not in fact necessary to include every cycle subproblem but simply a subset which form a cycle basis (Barahona 1993). Furthermore, there exists an efficiently computable witness for identifying a violated cycle (Barahona and Mahjoub 1986). Sontag and Jaakkola (2007) use this as the basis for a cutting plane method which successively adds cycle constraints to the dual.^5

^5 It is important to note that a cycle basis for $G_1$ is not sufficient to achieve the bound $E_{CYCLE}$ given by the collection of all cycles in $G$ since a cycle in $G$ corresponds to a wheel in $G_1$.

We would now like to consider cycles in MRFs which do have unary potentials. We start with the simplest case of a single cycle.

Lemma 5.2 The minimum energy of a single cycle is the same as the maximum lower-bound given by the graph in which the unary potentials have been replaced by a collection of auxiliary nodes (one for each edge in the cycle) where each node in the cycle is connected to the pair of auxiliary nodes corresponding to its incident edges.

Proof Sketch. Figure 2 provides a visualization of the set of auxiliary nodes (squares) added to the cycle (circles). We refer to this as the "saw" graph. Suppose we have optimized the decomposition of unary parameters across the auxiliary node connections to maximize the lower-bound. We claim that at the optimal decomposition, there always exists a minimal energy configuration such that all the auxiliary nodes take on state 0, making the bound equivalent to the cycle with a single auxiliary node.

Suppose we choose a minimum energy configuration of the graph but the duplicate auxiliary nodes take on mixed states. Start at some point along the cycle where there is an auxiliary node in state 0 and proceed clockwise until we find an auxiliary node in state 1. As we continue around the cycle we will encounter some later point at which the auxiliary nodes return to being in state 0. This is most easily visualized in terms of the cut separating 0 and 1 nodes as shown in Figure 2.



Let $X_i$ be the first node which is attached to a pair of disagreeing copies of the auxiliary node and $X_j$ be the second such node. Consider the four possible cuts highlighted in red and green in Figure 2. At the optimal decomposition of the parameters, it must be the case that these paths have equal costs. If not, then we could transfer weight (e.g., from one auxiliary connection of $X_i$ to the other) and increase the energy, contradicting optimality. Let $C_1$ denote the common cost of the two cuts passing $X_i$, each of which cuts one cycle edge at $X_i$ (with weight $\theta_{ic}$ or $\theta_{id}$) together with one of the auxiliary connections of $X_i$, and let $C_2$ denote the common cost of the two cuts passing $X_j$, each of which cuts one cycle edge at $X_j$ (with weight $\theta_{jh}$ or $\theta_{jg}$) together with one of the auxiliary connections of $X_j$. If one of the four cuts shown is minimal then it must be that $C_1 + C_2 \leq 0$, otherwise the path which cuts none of these edges (orange) would be preferred. However, if $C_1 + C_2 < 0$ then there is yet another cut (blue) which would achieve an energy that is lower by a non-zero amount $(C_1 + C_2)$ by cutting both sets of edges. Therefore, it must be the case that $C_1 + C_2 = 0$ and thus either the orange or the blue cut also represents a minimal configuration that leaves the collection of auxiliary nodes in state 0. A similar line of argument works for the cases when $X_c = 1$ or $X_h = 1$ or both.

We are thus free to flip the states of the block of disagreeing auxiliary nodes and their neighbors on the cycle without changing the energy. We can then continue around the cycle in this manner until all copies of the auxiliary nodes are in state 0 as desired. □

We are now ready to give the main result of this sec- 
tion. 

Theorem 5.3 The lower-bound given by the planar cycle covering graph is equal to the lower-bound given by decomposition into the collection of all cycles so that $E_{PCC} = E_{CYCLE}$.

Proof Sketch. We proceed by showing a circular sequence of inequalities. Figure 3 provides a graphical overview. Take the set of cycles which yield the bound $E_{CYCLE}$. We can apply Lemma 5.2 to transform each cycle subproblem into a corresponding "saw" containing an auxiliary node for each edge while maintaining the bound. We then observe that every such augmented cycle is a subgraph of the planar cycle covering graph. As with any such decomposition into subgraphs, the minimal energy of the cycle covering graph must be at least as large as the sum of the minimal subgraph energies and hence $E_{CYCLE} \leq E_{PCC}$. On the other hand, since the PCC graph is now a planar binary MRF with no unary terms, by Lemma 5.1 we can decompose it exactly into the collection of its constituent cycles with no loss in the bound. Finally each of these cycles is itself a subgraph of some augmented cycle and hence we must also have that $E_{CYCLE} \geq E_{PCC}$, proving equality. □



Batra et al. (2010) and Globerson and Jaakkola (2007) both propose decomposing a binary MRF into a set of tractable planar graphs. Based on the previous result, we can clearly see that the best achievable bound under such a decomposition must include a subproblem that covers every chordless cycle in the original graph. If consistency along a particular cycle is not enforced we can always arrange parameters so that the resulting bound is arbitrarily bad. We also show the converse, that outer-planar decomposition can do no better than the set of cycles.

Corollary 5.4 The best lower-bound achieved by any outer-planar decomposition for a planar MRF is no larger than $E_{PCC}$.

Proof Sketch. Take any outer-planar decomposition of a planar MRF. We first note that an outer-planar graph may be decomposed into a forest of blocks consisting of either biconnected components or individual edges, where blocks are connected by single vertices (cut vertices). Each biconnected component in turn has a dual graph which is a tree, meaning it consists of face cycles which have one edge in common (see, e.g., Syslo (1979) for a more in-depth discussion).

We first split apart the forest into blocks. Consider any pair of blocks connected at a single cut vertex $X_i$. To split them, we introduce copies $X_i^1, X_i^2$ of the cut vertex which are allowed to take on independent states. The unary parameter $\theta_i$ is shared between these two copies with the constraint that $\theta_i^1 + \theta_i^2 = \theta_i$. There exists an optimal decomposition of $\theta_i$ which assures the two nodes share an optimizing configuration. For, suppose to the contrary that the optimal decomposition yielded a minimum energy configuration where $X_i^1$ and $X_i^2$ took on different states, say $X_i^1 = 0$ and $X_i^2 = 1$. Then, shifting weight from $\theta_i^1$ to $\theta_i^2$ would drive up the energy of such a disagreeing configuration, contradicting optimality of the decomposition.

Once blocks have been split apart, we may apply essentially the same argument to split each biconnected component into its constituent face cycles. Consider the pair of neighboring nodes $X_i, X_j$ which are split into $X_i^1, X_i^2, X_j^1$, and $X_j^2$. At the optimal decomposition of the parameters $\theta_i, \theta_j, \theta_{ij}$, it again must be the case that the copies of the duplicated edge share at least one optimizing configuration. If not then the parameters could be redistributed by removing weight from one or more unused states in one copy and adding it to the set of optimizing states for the other copy. This would increase the energy and thus contradict optimality of the decomposition.

Thus any outer-planar decomposition is equivalent to a bound given by the set of constituent cycles and edges. Every one of these subproblems is a subgraph of the cycle covering graph and so the bound can be no tighter than the PCC graph bound. □

6 Experimental Results

We demonstrate the performance of the planar cycle cover bound on randomly generated Ising grid problems, and compare against two state-of-the-art approaches: max-product linear programming (MPLP) with incrementally added cycles (Sontag et al. 2008) and reweighted perfect matching (RPM) (Schraudolph 2010).



Each problem consists of a grid of size $N \times N$ with pairwise potentials drawn from a uniform distribution $\theta_{ij} \sim U(-1, 1)$. The unary potentials are generated from a uniform distribution $\theta_i \sim U(-a, a)$, where the magnitude $a$ determines the difficulty of the problem. Large values are relatively easy to solve, since each variable has strong local information about its optimal value; as $a$ becomes smaller the problems typically become more difficult. We generate three categories of problem, "easy" ($a = 3.2$), "medium" ($a = 0.8$), and "hard" ($a = 0.2$), and show the results on each class of problem separately. To make it easy to test convergence, we scaled the weights by 500 and rounded them to integers. Thus a gap of less than 1 between lower



Figure 3: Graphical depiction of Theorem 5.3 demonstrating that the planar cycle covering graph enforces constraints over all cycles of the original graph. (a) depicts the lower bound $E_{CYCLE}$ based on a decomposition into the collection of all simple cycles of the original graph. Lemma 5.2 shows that this bound is equivalent to the bound given by a corresponding collection of graphs (b) in which unary potentials are captured by multiple auxiliary nodes placed along each edge. Since every one of these graphs is a subgraph of the planar cycle covering graph (c) their minimum energy must be less than $E_{PCC}$. Finally, since the planar cycle covering graph (c) has no unary potentials, it is equal to its collection of cycles which are themselves all subgraphs of (b).



and upper bounds provides a certificate of optimality. We implemented the PCC bound using the Blossom V implementation of Kolmogorov (2009). At each step $t$ we obtain both a lower-bound $E^t_{PCC}$ and a configuration of $X = [X_1, \ldots, X_N]$ and the copies $\{X_0^f\}$. We compute the energy of two possible joint solutions, $X$ and its complement $\bar{X}$, and save the best solution found so far and its energy $E^*$ as a current upper bound. The variational parameters are updated using the projected sub-gradient given in Equation 7 and the step size $\lambda$ is chosen using Polyak's step size rule, i.e., given sub-gradient $g(\theta)$ we choose $\lambda = (E^* - E^t_{PCC}) / \|g(\theta)\|^2$. The incremental update feature of Blossom V is used to speed up successive optimizations as the variational parameters are modified.
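The experimental setup described above can be sketched as follows. This is not the authors' code: the function names, the seeding, and the dictionary layout are assumptions, and the step-size function uses the usual form of Polyak's rule with the current upper bound standing in for the unknown optimum.

```python
import random

def make_ising_grid(n, a, scale=500, seed=0):
    """Random n x n Ising grid with integer weights.

    Pairwise weights are drawn from U(-1, 1) and unary weights from U(-a, a),
    then scaled by `scale` and rounded to integers.
    """
    rng = random.Random(seed)
    pair, unary = {}, {}
    for r in range(n):
        for c in range(n):
            unary[(r, c)] = round(scale * rng.uniform(-a, a))
            if c + 1 < n:  # edge to the right neighbour
                pair[((r, c), (r, c + 1))] = round(scale * rng.uniform(-1, 1))
            if r + 1 < n:  # edge to the neighbour below
                pair[((r, c), (r + 1, c))] = round(scale * rng.uniform(-1, 1))
    return pair, unary

def polyak_step(upper, lower, subgrad):
    """Polyak step size (upper - lower) / ||g||^2; zero if the subgradient vanishes."""
    norm_sq = sum(g * g for g in subgrad)
    return (upper - lower) / norm_sq if norm_sq > 0 else 0.0

pair, unary = make_ising_grid(4, a=0.8)   # a small "medium" instance
step = polyak_step(upper=10.0, lower=4.0, subgrad=[1.0, -1.0, 1.0])
```

With integer weights every energy is an integer, so a gap below 1 between the bounds means they coincide.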

For both MPLP and RPM, we used the original au- 
thors' code available online. MPLP first runs an op- 
timization corresponding to the tree-reweighted lower 
bound (TRW), then successively tightens this bound 
by trying to identify cycles whose constraints are sig- 
nificantly violated and adding those subproblems to 
the collection. For grids, it enumerates and checks 
each square of four variables; we modified the code 
slightly to ensure that any given square is added only 
once. Because weak tree agreement can lead to subop- 
timal fixed points in MPLP, we tried both the standard 
message updates and a version which used subgradi- 
ent steps, but found little difference and report only 
the fixed point update results. We also note that be- 



cause this implementation of MPLP explicitly enumer- 
ates only a subset of cycles, the MPLP implementation 
may not provide the tightest possible lower-bound, an 
effect we observe in our experiments. 

For RPM, we used the author's implementation IsInf,
which uses a bundle-trust optimization subroutine for 
its subgradient updates. IsInf does not compute up- 
per bounds (proposed solutions) frequently; in plots 
showing the change in bounds over time we modified 
the code to also return such a solution, but used the 
default behavior for our timing comparisons. 

Figure 4 shows the upper and lower bounds found by
each algorithm as a function of time, for a single 32 x 32 
problem instance from each of the three categories. For 
the "easy" problem, all three methods find and verify 
the optimal solution (zero duality gap); in this case, 
MPLP converges more quickly than RPM, and PCC 
is faster still. For the "medium" problem, we see that 
MPLP converges more slowly and to a small duality 
gap, with RPM slightly faster and PCC still fastest. 
For the "hard" problem, MPLP has a large duality 
gap; in this case RPM and PCC still converge to and 
verify the optimum. In all cases, PCC is significantly 
faster than the other methods. 

Figure 5 shows timing results as a function of problem
size for all three algorithms. Since each method may 
converge (return a provably optimal solution) on some 
problems but not others, we report two quantities: the 
geometric mean of the time over all problems for which 
the method converged (upper row), and the fraction of 



problems that the method successfully solved (lower
row). As can be seen, PCC is significantly faster than 
the other two methods across both problem difficulty 
and size, and successfully solves a greater percentage 
of the problems. 
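The two reported quantities can be computed as in the following sketch. The function name and the input layout (one running time and one convergence flag per problem instance) are assumptions made for illustration.

```python
import math

def summarize(times, converged):
    """Geometric mean of time over solved instances, and fraction solved.

    times:     running time of each problem instance
    converged: True where the method returned a provably optimal solution
    """
    solved = [t for t, ok in zip(times, converged) if ok]
    fraction = len(solved) / len(times)
    if not solved:
        return float("nan"), fraction
    geo_mean = math.exp(sum(math.log(t) for t in solved) / len(solved))
    return geo_mean, fraction

# Hypothetical timings in seconds; the third instance did not converge.
geo_mean, fraction = summarize([1.0, 100.0, 50.0], [True, True, False])
```

Averaging in the log domain keeps a single slow instance from dominating the summary, which suits running times that span several orders of magnitude.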

7 Discussion 

We have described a new variational bound for per- 
forming inference in planar binary MRFs. Our bound 
subsumes those given by both the tree-reweighted
(TRW) and outer-planar decompositions of such a 
graph since it implicitly includes every edge and cycle 
as a sub-problem. Unlike approaches such as MPLP 
which successively add cycles, we are able to get the 
full benefit of all cycle constraints immediately. As a 
result we achieve fast convergence in practice. 

The PCC graph bound is limited to planar binary
problems. We are currently exploring routes to remove 
these limitations. For example, in general non-planar 
graphs, we can triangulate the graph to get a cycle ba- 
sis of triangles and then "glue" those triangles together 
into the smallest possible planar graph. In addition to 
MAP inference, it will also be interesting to see how 
the PCC graph relates to variational approximations 
to the marginals. 
Figure 4: Average convergence behavior of lower- and upper-bounds for randomly generated 32x32 Ising grid
problems. We compare PCC, the planar cycle cover bound (blue) to RPM (green) and MPLP (red) for easy,
medium and hard problems. The problem difficulty is controlled by the relative influence of unary and pairwise
potentials. Energies are averaged over 10 random problem instances and plotted relative to a MAP energy of 0.



[Figure 5 plots: panels Easy, Medium, Hard; curves PCC, RPM, MPLP; horizontal axis: Size (16, 32, 64, 128); upper row on a logarithmic vertical axis, lower row on a vertical axis from 0 to 1.]

Figure 5: Convergence times as a function of problem size for randomly generated Ising grid problems. We 
compare PCC (blue) to RPM (green) and MPLP (red) for easy, medium and hard problems. We record times 
for upper- and lower- bounds to converge averaged over 10 problem instances. We only include in the average 
convergence time those problem instances for which an algorithm was able to find the MAP configuration (a 
duality gap of less than 1). The second row of plots shows in each case the fraction of problems for which this 
happened. 



